The load data in smart grid contain a lot of valuable knowledge, which is useful for both electricity producers and consumers. Load classification is an important issue in load data mining. A five-stage process model of load classification is constructed based on a summary and analysis of studies on load classification in the smart grid environment. Then, the commonly used clustering methods for load classification are summarized and briefly reviewed, and the well-known evaluation methods for load classification are introduced. The applications of load classification, including bad data identification and correction, load forecasting and tariff setting, are also discussed. Finally, an example of load classification based on Fuzzy c-means (FCM) is presented.
Nomenclature

| Symbol | Meaning |
| --- | --- |
| $x_j$ | jth data object |
| $v_i$ | cluster center of $C_i$ |
| $C_i$ | ith cluster |
| $V$ | cluster center matrix |
| $U$ | membership degree matrix |
| $\mu_{ij}$ | membership degree of $x_j$ to $v_i$ |
| $d(x_j, v_i)$ | Euclidean distance of $x_j$ to $v_i$, $d(x_j, v_i) = \lVert x_j - v_i \rVert^2$ |
| $J_m(U, V)$ | objective function of FCM |
| $E$ | sum of squared errors of all sample objects |
| $n$ | number of data objects |
| $c$ | number of clusters |
| $m$ | fuzziness parameter in FCM |

Subscripts

| Symbol | Meaning |
| --- | --- |
| $i$ | ith cluster |
| $j$ | jth data object |
| $k$ | kth cluster |

1. Introduction

There is a great difference in the electricity consumption patterns of different types of users, such as domestic, commercial, industrial, agricultural, etc. Even for the same type of users, their patterns of electricity consumption may be different. Mining the electricity consumption patterns of different electricity users based on load data classification can not only support the production planning, the making of competitive market policies and the provision of more personalized electric power services for electricity producers [1,2], but also help different electricity users to enhance the understanding of their electricity consumption patterns. Moreover, users can adjust their electricity consumption strategies more economically and optimally based on the knowledge discovered from load classification. Hence, the electricity consumption costs will be reduced and the energy use efficiency will be improved more significantly [3].

Load classification partitions various load patterns into groups, using clustering algorithms, so that load patterns in the same group are more similar to each other than to load patterns in other groups [4,5]. The characteristic load pattern is used to represent and describe the load patterns in the same group. Load classification is an important part of load modeling; therefore, the accuracy of load classification can directly affect the reasonableness and effectiveness of load modeling [6].

Load classification is a process that consists of many steps, such as load classification preparation, load classification implementation using a clustering method, and the understanding and application of load classification results. The process model and specific steps of load classification are presented in Section 2.

In the smart grid environment [7], a large amount of load data will be measured and collected by advanced load measuring equipment. The scale of the load data collected will be larger, and the structure will be more complex. Moreover, the form of load data in the smart grid environment will be more flexible. Therefore, mining and extracting valuable knowledge from the massive electric power load data in the smart grid environment is an important research direction.

Based on a summary and analysis of existing research on electric power load classification, the five-stage process model of load classification in the smart grid environment is established in Section 2. The commonly used clustering methods and result evaluation methods of load classification are reviewed and summarized in Section 3. Section 4 presents the applications of load classification, including bad data identification and correction, load forecasting, and tariff setting. Section 5 gives an example of load classification based on the Fuzzy c-means (FCM) algorithm [8]. Finally, conclusions are drawn in Section 6.


2.  Process  model  of  load  classification 

The  electric  power  load  data  in  smart  grid  is  big.  Specifically,  its 
scale  is  large,  its  structure  is  complex  and  heterogeneous,  its 
dimension  is  high,  and  its  form  is  real-time  and  dynamic.  These 
characteristics  make  the  load  classification  in  smart  grid  even 
more  difficult.  Hence,  a  definite  model  of  load  classification  is 
necessary. 

As shown in Fig. 1, there are five stages in the process of load classification, namely, load data preparation, load data clustering preparation, load data clustering implementation, understanding and evaluation of load classification results, and applications of load classification results.

The preparation of input data for load classification is the first step. The power load conditions are determined first, according to the dimensions of time, region and type of substation. Sample data are then selected from the massive load data using sampling methods. Afterwards, the selected sample load data are normalized, and outliers and noise data are identified and corrected.
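The normalization step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say which normalization is used, so peak (maximum) normalization is assumed here, and the function name `normalize_profile` and the sample values are hypothetical.

```python
def normalize_profile(profile):
    """Scale a load profile by its peak so that all values lie in (0, 1].

    Peak normalization is an assumption; the text only states that the
    sample load data are normalized.
    """
    peak = max(profile)
    if peak <= 0:
        raise ValueError("profile must contain a positive load value")
    return [value / peak for value in profile]


# Hypothetical 6-point profile in kW (a daily profile would have 24 hourly values).
raw = [120.0, 80.0, 200.0, 400.0, 360.0, 160.0]
normalized = normalize_profile(raw)
print(normalized)  # [0.3, 0.2, 0.5, 1.0, 0.9, 0.4]
```

Scaling each profile by its own peak keeps the shape of the curve while removing differences in absolute consumption, so that clustering groups users by pattern rather than by size.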

The load data clustering preparation stage includes determining the classification characteristics, choosing an appropriate clustering algorithm and determining its parameters. Three kinds of load characteristic indices, descriptive, comparative and curved, are summarized in [9]. The well-known clustering algorithms used for load classification are K-means, FCM, hierarchical clustering, etc. The parameters of a clustering algorithm are, for example, the initial cluster centers, the number of clusters, and the fuzziness parameter in FCM.

The third step is to run the chosen clustering algorithm, with its parameters, on the pre-processed load data and the selected classification characteristics.

After the load data clustering, the classification results need to be understood and evaluated. The classification results are generally presented as a certain number of groups of load patterns and their corresponding representative load patterns. The information and characteristics of each group of load patterns need to be described and understood. In addition, cluster validity indices are generally used to validate the quality of clustering results.

Fig. 1. Process model of load classification.

Fig. 2. Clustering methods that can be used for load classification.

Table 1
Studies about four commonly used clustering methods for load classification.

| Clustering algorithm | Short description | Pros and cons | References |
| --- | --- | --- | --- |
| K-means | A classical partitioning crisp clustering method | Simple, efficient and scalable; difficult to determine initial cluster centers and cluster numbers, sensitive to noise and outliers, applicable only to spherical data, etc. | [15-18] |
| FCM | A well-known local search fuzzy clustering method, also a partitioning method | Membership degree of fuzzy partitions is introduced; difficult to determine initial cluster centers and cluster numbers, easy to fall into local optimum, etc. | [19-24] |
| Hierarchical clustering method | The bottom-up aggregation or top-down split of groups | Easy to implement; difficult to select the agglomeration and split points, etc. | [25-28] |
| SOM | A kind of unsupervised neural network method | Can identify the most significant characteristics with self-stability and has strong anti-noise ability; the learning efficiency depends on the input order of sample objects when the number of objects is small, and performance is affected by factors such as the weights of network connection, the adjustment of learning efficiency, the selection of neighborhood function, etc. | [29-32] |

The ultimate goal of the load classification process is to support the decision-making of power system participants. Based on the knowledge and information discovered from load classification, demand-side management can be implemented. Load classification can also improve the practicality of bad data identification and correction, the accuracy of load forecasting, and the appropriateness of tariff setting.


3.  Clustering  methods  and  result  evaluation  methods  for  load 
classification 

3.1.  Clustering  methods  for  load  classification 

Clustering methods [10,11] can be grouped into five categories based on the clustering criterion, and each category contains many specific clustering methods. For example, partitioning methods include K-means, FCM, PAM, etc. All of these clustering methods, which are summarized in Fig. 2, can be used for load classification.

We should note that no single clustering method is always superior to the others for load classification, as is also the case in other applications. Some methods are more commonly used for load classification than others because they are easier to operate or give better results.

We give a brief introduction to the four commonly used clustering methods for load classification, K-means [12], FCM [8], hierarchical clustering method [13] and self-organization mapping (SOM) [14], in Sections 3.1.1-3.1.4. The four methods and corresponding references are summarized in Table 1.

In addition to the above four commonly used methods, some new methods, such as Support Vector Clustering [33], FaiNet [34], honey bee mating optimization [35], ant colony optimization algorithm [36], follow the leader [37], iterative refinement clustering [38], and ISODATA [39], have also been studied and used for load classification.

Although there are some differences in the configuration of platforms, software, and hardware when different clustering methods are used for load classification, some key requirements are common to all of them: a load data measuring and collection platform such as AMR (Automated Meter Reading), computing software such as MATLAB, SPSS or R, and high-performance computers.

3.1.1. K-means algorithm

The K-means algorithm [12] is a classical crisp clustering method used for load classification. Its basic idea is as follows: once the number of clusters c is determined, c initial cluster centers are selected randomly, and each of the other objects is allocated to its nearest cluster according to the distance between the object and the cluster centers. The iteration is repeated until the criterion function shown in Eq. (1) converges to within a certain range.

$$E = \sum_{i=1}^{c} \sum_{x_j \in C_i} d(x_j, v_i) \qquad (1)$$
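The procedure above can be sketched in plain Python. This is an illustrative sketch rather than a reference implementation: it assumes that d is the squared Euclidean distance (consistent with E being a sum of squared errors), that each center is updated to the mean of its members as in standard K-means, and that iteration stops when E changes by less than a tolerance `tol`. The names `sq_dist`, `kmeans`, `tol` and `seed`, and the toy data, are hypothetical.

```python
import random


def sq_dist(x, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def kmeans(data, c, max_iter=100, tol=1e-9, seed=0):
    """Cluster `data` (a list of vectors) into c groups.

    Returns (centers, labels, E), where E is the criterion of Eq. (1).
    """
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(data, c)]  # random initial centers
    prev_e = float("inf")
    for _ in range(max_iter):
        # Assignment step: each object goes to its nearest cluster center.
        labels = [min(range(c), key=lambda i: sq_dist(x, centers[i])) for x in data]
        # Update step: each center becomes the mean of its members.
        for i in range(c):
            members = [x for x, lab in zip(data, labels) if lab == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
        # Criterion function E of Eq. (1) for the current partition.
        e = sum(sq_dist(x, centers[lab]) for x, lab in zip(data, labels))
        if prev_e - e < tol:
            break
        prev_e = e
    return centers, labels, e


# Toy two-dimensional "profiles": two low-load and two high-load objects.
toy = [[0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.95, 0.85]]
centers, labels, e = kmeans(toy, c=2)
print(labels, round(e, 5))
```

Because the initial centers are drawn at random, different seeds can lead to different partitions on less clearly separated data.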

The operation of K-means is simple, efficient and scalable. Hence, it is a commonly used crisp clustering method for load classification. However, the deficiencies of K-means include: (1) the selection of initial cluster centers can significantly affect the algorithm, (2) determining the appropriate number of clusters is difficult, (3) it is sensitive to noise and outliers, and (4) it can only be used to find groups in spherical data sets. Therefore, the K-means algorithm used in load classification is usually modified or optimized [15-18].

Table 2
Well-known CVIs for the evaluation of load classification results.

| Type | CVI |
| --- | --- |
| Crisp clustering | Dunn; 6×3 variants of Dunn based on different $D(C_i, C_j)$ and $\delta(C_i)$; C; Gamma; CS; COP |
| Fuzzy clustering | PC; PE; FS; NFI; PBMF; PCAES; CO |

In the Dunn index, $D(C_i, C_j) = \min_{x \in C_i,\, y \in C_j} d(x, y)$ and $\delta(C_i) = \max_{x, y \in C_i} d(x, y)$. References: [43-58].


3.1.2. FCM

FCM [8] is a well-known local search fuzzy clustering method. In crisp clustering, a data object belongs to one and only one group. In fuzzy clustering, by contrast, a data object is not strictly assigned to a single group but belongs to more than one group, with a certain degree of membership to each.

The FCM algorithm starts by determining the number of clusters and then guessing the initial cluster centers. Every data object is then assigned a membership degree to each cluster. The cluster centers and the corresponding membership degrees are updated iteratively by minimizing the objective function until the positions of the cluster centers no longer change or the difference between the objective function values of two successive iterations falls within a permitted tolerance.

The objective function of the FCM algorithm is defined as

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} \mu_{ij}^{m}\, d(x_j, v_i) \qquad (2)$$

The iterative procedure updates the membership degrees $\mu_{ij}$ and the cluster centers $v_i$ by

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d(x_j, v_i) / d(x_j, v_k) \right)^{1/(m-1)}} \qquad (3)$$

$$v_i = \frac{\sum_{j=1}^{n} \mu_{ij}^{m} x_j}{\sum_{j=1}^{n} \mu_{ij}^{m}} \qquad (4)$$

where $\mu_{ij}$ satisfies

$$\mu_{ij} \in [0, 1] \qquad (5)$$

$$\sum_{i=1}^{c} \mu_{ij} = 1, \quad \forall j = 1, \dots, n \qquad (6)$$

$$0 < \sum_{j=1}^{n} \mu_{ij} < n, \quad \forall i = 1, \dots, c \qquad (7)$$

FCM introduces the concept of the membership degree of fuzzy partitions, and it has become a popular fuzzy clustering method for load classification [19-24]. However, FCM is also sensitive to the initial cluster centers, and the number of clusters is also difficult to determine. Moreover, it easily falls into a local optimum. All of these factors can affect the accuracy and effectiveness of load classification. FCM optimized by intelligent algorithms is an interesting research direction [24].
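The FCM iteration (alternating membership and center updates until the objective function stabilizes) can be sketched as follows. This is a hedged illustration under stated assumptions: d is taken as the squared Euclidean distance, initial centers are sampled from the data, an object that coincides with a center is given full membership to that center, and the stopping tolerance `tol` is arbitrary. The names `fcm`, `sq_dist`, `tol` and `seed`, and the toy data, are hypothetical.

```python
import random


def sq_dist(x, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, v))


def fcm(data, c, m=2.0, max_iter=200, tol=1e-8, seed=0):
    """Fuzzy c-means. Returns (u, centers, objective) with u[i][j] the
    membership degree of object j to cluster i."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    centers = [list(x) for x in rng.sample(data, c)]  # guessed initial centers
    prev_obj = float("inf")
    for _ in range(max_iter):
        # Membership update: inverse-distance ratios with exponent 1/(m-1).
        u = [[0.0] * n for _ in range(c)]
        for j, x in enumerate(data):
            d = [sq_dist(x, v) for v in centers]
            if min(d) == 0.0:
                u[d.index(0.0)][j] = 1.0  # object sits exactly on a center
            else:
                for i in range(c):
                    ratio_sum = sum((d[i] / dk) ** (1.0 / (m - 1.0)) for dk in d)
                    u[i][j] = 1.0 / ratio_sum
        # Center update: mean of the objects weighted by membership^m.
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            if total > 0:
                centers[i] = [sum(w[j] * data[j][k] for j in range(n)) / total
                              for k in range(dim)]
        # Objective function: membership-weighted sum of distances.
        obj = sum(u[i][j] ** m * sq_dist(data[j], centers[i])
                  for i in range(c) for j in range(n))
        if abs(prev_obj - obj) < tol:
            break
        prev_obj = obj
    return u, centers, obj


toy = [[0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.95, 0.85]]
u, centers, obj = fcm(toy, c=2, m=2.0)
# Hard labels, if needed, take the cluster with the largest membership.
hard = [max(range(2), key=lambda i: u[i][j]) for j in range(len(toy))]
print(hard)
```

Each column of `u` sums to one, so every object distributes a total membership of 1 across the clusters.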

3.1.3. Hierarchical clustering method

The main idea of the hierarchical method [13] is the bottom-up aggregation or top-down split of groups in a data set until a satisfactory classification result is formed. In agglomerative hierarchical clustering, each object in the data set is first regarded as a group, and larger groups are then formed by merging two groups at a time based on a certain criterion (generally the distances among clusters) until all the objects are in a single cluster or a termination condition is met. The steps of split-type (divisive) hierarchical clustering are just the opposite of the agglomerative type.

The steps of the hierarchical method are easy to implement, so it is widely used for load classification. However, the selection of agglomeration and split points is difficult. Because each clustering operation builds on the former steps, and the former steps cannot be changed, the final clustering result is directly affected by the merging or split operation in each step. Hence, the hierarchical clustering method is also improved and modified when used for load classification [25-28].
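The agglomerative variant can be sketched as below. This is only an illustration: the merging criterion is assumed to be average linkage (the mean pairwise Euclidean distance between two groups), and the termination condition is assumed to be a target number of clusters; the text leaves both open. The names `euclid`, `average_linkage` and `agglomerative`, and the toy data, are hypothetical.

```python
def euclid(x, y):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def average_linkage(data, a, b):
    """Mean pairwise distance between the members of groups a and b
    (a and b are lists of object indices)."""
    return sum(euclid(data[p], data[q]) for p in a for q in b) / (len(a) * len(b))


def agglomerative(data, n_clusters):
    """Merge the two closest groups repeatedly until n_clusters remain."""
    clusters = [[j] for j in range(len(data))]  # each object starts as a group
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                dist = average_linkage(data, clusters[p], clusters[q])
                if best is None or dist < best[0]:
                    best = (dist, p, q)
        _, p, q = best
        # A merge is never undone: later steps build on this decision.
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


toy = [[0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.95, 0.85]]
groups = agglomerative(toy, n_clusters=2)
print(groups)
```

The comment on the merge step reflects the limitation noted above: once two groups are merged, the decision cannot be revised in later steps.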

3.1.4.  Self-organization  mapping  (SOM) 

SOM [14] is a kind of unsupervised neural network method, also known as the Kohonen neural network. SOM is composed of an input layer and a competitive (output) layer. There are N input neurons in the input layer and M output neurons in the output layer. The neurons in the input layer and output layer are interconnected, and the winning neuron is the output-layer neuron nearest to the input presented to the N input neurons.

An evaluation function is not needed in SOM, and it can identify the most significant characteristics with self-stability. SOM also has strong anti-noise ability. For these reasons, SOM is widely used for load classification [29-32]. However, the learning efficiency depends on the input order of sample objects when the number of objects is small. Factors such as the weights of network connection, the adjustment of learning efficiency, and the selection of neighborhood function can significantly affect the performance of SOM and thereby the effectiveness of load classification.

3.2.  Evaluation  methods  for  load  classification  results 

Since clustering is an unsupervised process, the load data objects in data sets are unlabeled and no structural knowledge about the data set is available. Hence, measuring the quality of clustering results and determining the optimal number of clusters are difficult tasks. The most commonly used approach to determine the optimal number of clusters is to execute the clustering algorithm several times with different numbers of clusters and then select the number of clusters that gives the best result according to a predefined criterion function. The predefined criterion function is called a cluster validity index (CVI). When the number of clusters and other parameters of the clustering algorithm are fixed, a CVI can be used to evaluate and validate the results of load classification. Currently, a large number of CVIs have been proposed and reviewed [40-42]. Previous studies on CVIs have demonstrated that no single CVI can deal with all data sets and always perform better than the others. However, they agree on the basic principle that a good partition should have a small intra-cluster variance and a large inter-cluster separation at the same time. Here, we review some well-known CVIs which can be used for the evaluation of load classification results and present a brief summary in Table 2.
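As an illustration of how a CVI is used to compare candidate partitions, the sketch below computes the Dunn index, one of the crisp-clustering CVIs, for two labelings of the same data and keeps the one with the larger value (a larger Dunn index indicates more compact and better separated clusters). It assumes Euclidean distance and at least two clusters per labeling; the function names and the toy data are hypothetical, and the candidate labelings stand in for the outputs of repeated clustering runs.

```python
def euclid(x, y):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def dunn_index(data, labels):
    """Smallest between-cluster distance divided by the largest cluster
    diameter. Requires at least two distinct labels."""
    groups = {}
    for x, lab in zip(data, labels):
        groups.setdefault(lab, []).append(x)
    clusters = list(groups.values())
    # Largest within-cluster distance (cluster diameter).
    diameter = max(euclid(x, y) for g in clusters for x in g for y in g)
    # Smallest distance between objects in different clusters.
    separation = min(
        euclid(x, y)
        for a in range(len(clusters))
        for b in range(a + 1, len(clusters))
        for x in clusters[a]
        for y in clusters[b]
    )
    if diameter == 0.0:
        return float("inf")  # every cluster is a single point
    return separation / diameter


toy = [[0.10, 0.20], [0.15, 0.22], [0.90, 0.80], [0.95, 0.85]]
candidates = {
    2: [0, 0, 1, 1],  # hypothetical result of clustering with c = 2
    3: [0, 0, 1, 2],  # hypothetical result of clustering with c = 3
}
scores = {c: dunn_index(toy, labels) for c, labels in candidates.items()}
best_c = max(scores, key=scores.get)
print(best_c)
```

In practice the candidate labelings would come from running the chosen clustering algorithm once per candidate number of clusters, and several CVIs may be consulted, since no single index is best on every data set.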

4.  Applications  of  load  classification 

4.1.  Bad  data  identification  and  correction  based  on  load 
classification 

Bad data in the load data set can affect the decision-making of power producers, and even the daily running and safety of power systems [59]. In the smart grid environment, power producers and managers must accurately identify and appropriately process the bad load data.

Many studies have focused on bad data identification and correction based on load classification. Zhang et al. [60] presented an intelligent cleaning model for bad data based on load classification using a Kohonen neural network optimized by fuzzy soft clustering. Wang et al. [61] identified bad data effectively using the K-means clustering algorithm based on a cluster validity index, thereby reducing the amount of undetected and falsely detected bad data. Similarly, the bad data in transmission grid state estimation were detected, identified and corrected by the K-means algorithm combined with a validity index in [62]. Additionally, Jiang et al. [63] identified bad data according to the good data classification obtained by fuzzy equivalent matrix clustering.

Existing studies have demonstrated that the effectiveness and practicality of bad data identification and correction can be improved by load classification. The results obtained by load classification are the input of bad data identification and correction, and are therefore an important influencing factor.

4.2.  Load  forecasting  based  on  load  classification 

Load forecasting is a hot research direction in demand-side management of power systems, especially in the smart grid environment, and various load forecasting methods have been proposed [64].

Load classification can also be used for load forecasting, and the accuracy of load forecasting can be improved with the support of load classification. Misiti et al. [65] grouped the global electric power information based on clustering methods, and then obtained the overall forecasting results by combining the decomposed forecasting information. Li and Han [66] presented a load forecasting method based on ant colony clustering, which can improve the accuracy of load forecasting. Also, Jota et al. [28] gave a forecasting method for daily load curves and the peak load based on the typical daily curve and the corresponding dynamic load model obtained by hierarchical clustering.

Load classification can support the accurate forecasting of load in many ways. For example, the total load can be forecast based on the load of the different types of users obtained from load classification. In addition, the typical load profile or characteristic load of each type of consumer can be used as the input data of load forecasting.

4.3.  Tariff  setting  based  on  load  classification 

Many countries are adopting deregulation and open-market policies for the electricity market in the smart grid environment. These policies can promote competition in the electricity market, improve the efficiency of investments and power systems operation, and reduce costs. In China, the tariff reform has become the core of the power systems reform. Developing differentiated and personalized tariffs according to the load constitution of the distribution network and the consumption patterns of different users is an important and significant research area [67].

Fig. 3. 72 load profiles.

Mahmoudi et al. [68] pointed out that knowledge of how and when consumers use electricity is essential to the retailer in a competitive environment, and proposed an annual framework for optimal price offering by a retailer based on the clustering and classification of load profiles of consumers. Chicco et al. [69] analyzed the tariff setting and costs of power distribution companies based on the classification of consumers' load profiles. Huang et al. [70] presented a tariff decision-making model that considers load classification and electricity usage characteristics in terms of load rate, power supply voltage level, load shape, reliability requirements, etc. Also, Ozveren et al. [71] proposed a method for the automatic classification of large sets of electrical demand profiles using fuzzy relation, and the classification results can be used by Supply Business for tariff development and end user costing.

The tariff setting is a complex process and various techniques, such as optimization, decision-making, and economics, are required. Load classification is an important decision support tool for tariff setting.

Fig. 5. Characteristic load profile of each group.


5.  An  example  of  load  classification  based  on  FCM 

In order to illustrate the process of load classification, we present an example based on the FCM algorithm. The data used are 72 load profiles of 6 different types of electricity consumers in a city of China [72]. Each load profile is a daily load profile measured hourly. The load profiles are shown in Fig. 3.

The parameters in FCM are set as follows. The number of clusters is c = 6, the fuzziness parameter is m = 2.5, and the initial cluster centers are selected randomly. The algorithm is run 50 times and the average values are taken as the result. Following the process model described in Section 2, the load classification results are shown in Figs. 4 and 5.

As Figs. 4 and 5 show, the shapes of the load profiles, which indicate the electricity consumption patterns of different users, differ between groups. For example, the load profiles in the second group range from about 0.2 to 1.0, a relatively wide range, while the load values in the sixth group are high throughout, with a smaller range. Also, unlike the other five groups of load profiles, the fifth group has two peaks, one of which appears at night. Additionally, more information and knowledge can be discovered from the results of load classification. Based on these, decision-making and policy development can be more effective and efficient for both electricity producers and consumers.


6.  Conclusion 

With  the  in-depth  theoretical  study  and  widespread  application 
of  smart  grid,  load  classification  will  play  an  increasingly  important 
role  in  decision-making  of  power  systems  and  service  provision  of 
electricity  market.  Load  classification  methods  are  the  premise  and 
basis  of  load  classification,  and  the  analysis  and  applications  are  the 
ultimate  goal  of  load  classification.  The  difficulties  and  research 
directions  of  load  classification  in  smart  grid  are  as  follows. 

(1) The influence of the complex smart grid environment on load classification. The load data in the smart grid environment are massive, dynamic, high-dimensional, and heterogeneous. All these characteristics increase the difficulty of each stage of load classification, for example the efficient update of characteristic load patterns as consumers are added and deleted.

(2) The study of efficient and effective load classification methods. Traditional clustering algorithms, such as K-means, FCM and the hierarchical method, are widely used, but the deficiencies of these methods have been demonstrated, and they can significantly affect the effectiveness of load classification. Moreover, most traditional clustering methods are inefficient in dealing with the big load data in smart grid. Hence, more efficient new methods and optimized traditional methods should be developed. When evaluating and validating the load classification results, we should consider not only the values of CVIs, but also the characteristics of the load data and the purpose of load classification.

(3) The study of before-classification preparation and after-classification analysis. In addition to bad data identification and processing, data normalization and the selection of load classification characteristics, data sampling and reduction methods are also important research topics in the preparation process of load classification. In the after-classification process, more effective and efficient methods for the evaluation, understanding and analysis of classification results need to be studied.

(4) The expansion study of load classification in demand-side management [73]. Besides bad data identification and correction, load forecasting and tariff setting based on load classification, there are more applications based on the results of load classification, which are interesting research directions.